Inferring phylogeny from whole genomes

Authors

  • Pawel Górecki
  • Jerzy Tiuryn
Abstract

MOTIVATION Inferring species phylogenies with a history of gene losses and duplications is a challenging and important task in computational biology. This problem can be solved by duplication-loss models in which the primary step is to reconcile a rooted gene tree with a rooted species tree. Most modern methods of phylogenetic reconstruction (from sequences) produce unrooted gene trees. This limitation leads to the problem of transforming an unrooted gene tree into a rooted tree and then reconciling the rooted trees. The main questions are: 'What is the biological interpretation of choosing a rooting?', 'Can we find the optimal rootings efficiently?' and 'Is the optimal rooting unique?'

RESULTS In this paper we present a model for reconciling an unrooted gene tree with a rooted species tree, based on choosing a rooting with minimal reconciliation cost. Our analysis leads to the surprising property that all the minimal rootings have identical distributions of gene duplications and gene losses in the species tree. It implies, in our opinion, that the concept of an optimal rooting is very robust, and thus biologically meaningful. It also has nice computational properties. We present a linear time and space algorithm for computing the optimal rooting(s). This algorithm was used in two different ways to reconstruct the optimal species phylogeny of five known yeast genomes from approximately 4700 gene trees. Moreover, we determined the locations (history) of all gene duplications and gene losses in the final species tree. It is interesting to note that the top five species trees are the same for both methods.

AVAILABILITY Software and documentation are freely available from http://bioputer.mimuw.edu.pl/~gorecki/urec
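The idea of rooting an unrooted gene tree so that its reconciliation cost is minimal can be illustrated with a small brute-force sketch. This is not the linear-time algorithm announced in the abstract: it simply tries every edge as a root and scores each rooting with the standard LCA mapping. The cost used here (duplications plus losses), the nested-tuple tree encoding, and all function names (`clusters`, `lca`, `reconcile`, `to_graph`, `rootings`, `best_rootings`) are assumptions made for illustration only.

```python
from itertools import count

def clusters(tree, depth, index):
    """Fill index {leaf set of a species node: its depth}; return the leaf set."""
    if isinstance(tree, str):
        c = frozenset([tree])
    else:
        c = clusters(tree[0], depth + 1, index) | clusters(tree[1], depth + 1, index)
    index[c] = depth
    return c

def lca(index, leaves):
    """Deepest species node whose leaf set contains `leaves`."""
    return max((c for c in index if leaves <= c), key=lambda c: index[c])

def reconcile(gene, index):
    """Return (species node, duplications, losses) for a rooted binary gene tree."""
    if isinstance(gene, str):
        return frozenset([gene]), 0, 0
    m1, d1, l1 = reconcile(gene[0], index)
    m2, d2, l2 = reconcile(gene[1], index)
    m = lca(index, m1 | m2)
    is_dup = m in (m1, m2)  # a child maps to the same species node
    gap = (index[m1] - index[m]) + (index[m2] - index[m])
    losses = gap if is_dup else gap - 2  # assumed standard loss count
    return m, d1 + d2 + is_dup, l1 + l2 + losses

def to_graph(tree):
    """Turn any rooting of a binary tree into an unrooted adjacency map."""
    adj, labels, ids = {}, {}, count()
    def walk(t):
        n = next(ids)
        adj[n] = []
        if isinstance(t, str):
            labels[n] = t
        else:
            for child in t:
                c = walk(child)
                adj[n].append(c)
                adj[c].append(n)
        return n
    root = walk(tree)
    a, b = adj.pop(root)  # suppress the degree-2 root node
    adj[a].remove(root)
    adj[b].remove(root)
    adj[a].append(b)
    adj[b].append(a)
    return adj, labels

def subtree(adj, labels, node, parent):
    """Rooted subtree hanging below `node` when entered from `parent`."""
    if node in labels:
        return labels[node]
    return tuple(subtree(adj, labels, k, node) for k in adj[node] if k != parent)

def rootings(adj, labels):
    """Yield one rooted tree per edge of the unrooted tree."""
    for u in adj:
        for v in adj[u]:
            if u < v:
                yield (subtree(adj, labels, u, v), subtree(adj, labels, v, u))

def best_rootings(unrooted, species):
    """All rootings of minimal duplication-plus-loss cost (brute force)."""
    index = {}
    clusters(species, 0, index)
    adj, labels = to_graph(unrooted)
    scored = []
    for rooted in rootings(adj, labels):
        _, dups, losses = reconcile(rooted, index)
        scored.append((dups + losses, dups, losses, rooted))
    best = min(s[0] for s in scored)
    return [s for s in scored if s[0] == best]

species = (("a", "b"), ("c", "d"))
gene = ("a", ("b", ("c", "d")))  # any rooting of the unrooted tree ab|cd
for cost, dups, losses, rooted in best_rootings(gene, species):
    print(cost, dups, losses, rooted)
```

In this toy input the gene tree has the same unrooted topology as the species tree, so the only zero-cost rooting is the one placed on the internal edge separating a, b from c, d. The enumeration re-scores every rooting from scratch and is therefore quadratic in the size of the gene tree, which is exactly the cost a linear-time method avoids.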

Similar resources

Inferring Ancestral Chloroplast Genomes with Inverted Repeat

Genome evolution is shaped not only by nucleotide substitutions, but also by structural changes including gene and genome duplications, insertions/deletions and gene order rearrangements. Reconstruction of phylogeny based on gene order changes has been limited to cases where equal gene content or few deletions can be assumed. Since conserved duplicated regions are present in many Chloroplast ge...

Haplotype Block Partitioning and tagSNP Selection under the Perfect Phylogeny Model

Single Nucleotide Polymorphisms (SNPs) are the most common form of polymorphism in the human genome. Analyses of genetic variations have revealed that individual genomes share common SNP-haplotypes. The particular pattern of these common variations forms a block-like structure on the human genome. In this work, we develop a new method based on the Perfect Phylogeny Model to identify haplo...

A More Accurate and Efficient Whole Genome Phylogeny

To reconstruct a phylogeny for a given set of species, most of the previous approaches are based on the similarity information derived from a subset of conserved regions (or genes) in the corresponding genomes. In some cases, the regions chosen may not reflect the evolutionary history of the species and may be too restricted to differentiate the species. It is generally believed that the infere...

TIBA: a tool for phylogeny inference from rearrangement data with bootstrap analysis

TIBA is a tool to reconstruct phylogenetic trees from rearrangement data that consist of ordered lists of synteny blocks (or genes), where each synteny block is shared with all of its homologues in the input genomes. The evolution of these synteny blocks, through rearrangement operations, is modelled by the uniform Double-Cut-and-Join model. Using a true distance estimate under this model and s...

Advances in phylogeny reconstruction from gene order and content data.

Genomes can be viewed in terms of their gene content and the order in which the genes appear along each chromosome. Evolutionary events that affect the gene order or content are "rare genomic events" (rarer than events that affect the composition of the nucleotide sequences) and have been advocated by systematists for inferring deep evolutionary histories. This chapter surveys recent developmen...

Journal:
  • Bioinformatics

Volume 23, Issue 2

Pages -

Publication date: 2007